Regex for Beginners ✳️

Published on Dec 22, 2024

#regex #programming

What is Regex? 🤔

Regex (short for “regular expression”) is a tool for matching patterns in strings. In JavaScript and several other languages, a regex is written as a pattern between two forward slashes (/), followed by optional flags that modify its behavior.

/pattern/flags


Flags 🚩

Flags change how regex behaves:

  • Case Insensitive (i): Matches without considering case.
  • Global (g): Finds all matches instead of stopping at the first.

/the/gi

Matches every occurrence of “the”, regardless of case.
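To see the flags in action, here is a small sketch, assuming JavaScript (whose regex literals use the /pattern/flags form). The sample sentence is made up for illustration.

```javascript
// Sample text is an assumption chosen for illustration.
const flagText = "The cat and the dog saw THE bird.";

// No flags: case-sensitive, and match() returns only the first match.
const firstOnly = flagText.match(/the/);
console.log(firstOnly[0]);    // "the"
console.log(firstOnly.index); // 12

// g + i: every occurrence, regardless of case.
const allMatches = flagText.match(/the/gi);
console.log(allMatches);      // ["The", "the", "THE"]
```

Without the g flag, match() stops at the first hit; with it, you get an array of every hit.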


Literal & Metacharacters 🔡🔣

Literal Characters 🔡

Regex can match literal characters. For instance, the regex:

cat

Matches occurrences of “cat” in a string.

Metacharacters 🔣

Metacharacters are special characters with specific meanings:

  • * (asterisk): Matches zero or more occurrences of the preceding element.
  • . (dot): Matches any single character (except a line break, by default).

If you want to use the literal value of a metacharacter, escape it with a backslash (\).
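A quick sketch of the difference between a metacharacter and its escaped form, assuming JavaScript. The test strings are arbitrary examples.

```javascript
const unescapedDot = /a.c/;  // "." stands for any single character
const escapedDot = /a\.c/;   // "\." stands for a literal dot only

console.log(unescapedDot.test("abc")); // true
console.log(unescapedDot.test("a.c")); // true
console.log(escapedDot.test("abc"));   // false
console.log(escapedDot.test("a.c"));   // true
```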


Quantifiers 🧮

Quantifiers specify how many times a pattern should match:

  • *: Zero or more times.
  • +: One or more times.
  • ?: Zero or one time (optional).
  • {n}: Exactly n times.
  • {n,}: At least n times.
  • {n,m}: Between n and m times.

Because * allows zero repetitions, a pattern like a* can also match an empty string: it literally “matches” nothing.

Examples 📝

  • a* matches "a", "aa", "aaaaaaaaaaaaa" (any number of times), or an empty string.
  • a+ matches "a", "aa", "aaa", "aaaa", and so on, but not an empty string.
  • a{2,4} matches "aa", "aaa", or "aaaa".
  • hay? matches "ha" or "hay".
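The quantifier examples can be checked with a short sketch, assuming JavaScript. Each pattern is wrapped in ^ and $ (see the Anchors section) so that the whole string has to match, not just a part of it. The sample strings are chosen for illustration.

```javascript
// Sample strings, chosen for illustration.
const quantifierSamples = ["", "a", "aa", "aaaaa"];

// ^...$ forces the pattern to cover the entire string.
const zeroOrMore = quantifierSamples.filter((s) => /^a*$/.test(s));
const oneOrMore = quantifierSamples.filter((s) => /^a+$/.test(s));
const twoToFour = quantifierSamples.filter((s) => /^a{2,4}$/.test(s));
const optionalY = ["ha", "hay", "hayy"].filter((s) => /^hay?$/.test(s));

console.log(zeroOrMore); // ["", "a", "aa", "aaaaa"]
console.log(oneOrMore);  // ["a", "aa", "aaaaa"]
console.log(twoToFour);  // ["aa"]
console.log(optionalY);  // ["ha", "hay"]
```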

Greedy 🤑 vs. Lazy Matching 😴

  • Greedy Matching: Matches as much as possible.
  • Lazy Matching: Matches as little as possible. Make a quantifier lazy by adding ? after it (for example, *? or +?).

Examples 📝

When looking at the sentence:

The quick brown fox jumps over the lazy dog.

  • Greedy: T.*o matches “The quick brown fox jumps over the lazy do” (everything up to the last “o”).
  • Lazy: T.*?o matches “The quick bro” (everything up to the first “o”).
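Running both patterns against the sentence shows the difference. This is a minimal sketch, assuming JavaScript.

```javascript
const sentence = "The quick brown fox jumps over the lazy dog.";

// Greedy: .* grabs as much as it can, then backs up to the last "o".
const greedyMatch = sentence.match(/T.*o/)[0];

// Lazy: .*? grabs as little as it can, stopping at the first "o".
const lazyMatch = sentence.match(/T.*?o/)[0];

console.log(greedyMatch); // "The quick brown fox jumps over the lazy do"
console.log(lazyMatch);   // "The quick bro"
```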

Bracket Expressions

Bracket expressions match specific characters:

  • [abc]: Matches “a”, “b”, or “c”.
  • [a-z]: Matches any lowercase letter.
  • [A-Z0-9]: Matches uppercase letters or digits.
  • [^abc]: Matches any single character except “a”, “b”, or “c”.

Example

  • [a-zA-Z] matches any letter.
  • [0-9] matches any digit.
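Here is how a few bracket expressions behave on sample input. This is a sketch assuming JavaScript, and the input strings are made up.

```javascript
// Input strings are assumptions chosen for illustration.
const bracketInput = "Room 42B, floor 7";

const digitChars = bracketInput.match(/[0-9]/g);
const upperOrDigit = bracketInput.match(/[A-Z0-9]/g);
const notAbc = "cabbage".match(/[^abc]/g);

console.log(digitChars);   // ["4", "2", "7"]
console.log(upperOrDigit); // ["R", "4", "2", "B", "7"]
console.log(notAbc);       // ["g", "e"]
```

Each bracket expression matches exactly one character, which is why the results are single characters.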

Character Classes

Shorthand for common patterns:

  • \d: Matches digits ([0-9]).
  • \w: Matches word characters ([a-zA-Z0-9_]).
  • \s: Matches whitespace.
  • \D, \W, \S: Match the inverse.
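The shorthand classes can be tried out like this. This is a sketch assuming JavaScript, with a made-up sample string.

```javascript
// Sample string is an assumption chosen for illustration.
const classInput = "Order #15: 3 apples";

const numbers = classInput.match(/\d+/g);          // runs of digits
const words = classInput.match(/\w+/g);            // runs of word characters
const pieces = classInput.split(/\s+/);            // split on whitespace
const wordCharsOnly = classInput.replace(/\W/g, ""); // drop non-word characters

console.log(numbers);       // ["15", "3"]
console.log(words);         // ["Order", "15", "3", "apples"]
console.log(pieces);        // ["Order", "#15:", "3", "apples"]
console.log(wordCharsOnly); // "Order153apples"
```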

Anchors

Anchors match specific positions in a string:

  • ^: Start of a string.
  • $: End of a string.
  • \b: Word boundary.

Example

  • ^The matches “The” at the start of a string.
  • end$ matches “end” at the end of a string.

Groups and Alternation

  • Capturing Groups: Use parentheses to group part of a pattern and capture the text it matches.
    • Example: (fox|dog) matches “fox” or “dog”.
  • Alternation: Use | for logical OR.
    • Example: cat|dog matches “cat” or “dog”.
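In JavaScript (an assumption of this sketch), the text captured by a group is available on the match result at index 1, 2, and so on, while index 0 holds the whole match.

```javascript
const groupInput = "The quick brown fox jumps over the lazy dog.";

// The group limits the alternation to "fox" or "dog" after "lazy ".
const lazyAnimal = groupInput.match(/lazy (fox|dog)/);
console.log(lazyAnimal[0]); // "lazy dog" (the whole match)
console.log(lazyAnimal[1]); // "dog" (what the group captured)

// Plain alternation with the g flag finds every alternative.
const animals = groupInput.match(/fox|dog/g);
console.log(animals);       // ["fox", "dog"]
```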

Lookaheads and Lookbehinds

  • Lookahead: Matches based on what follows.
    • Positive: (?=...)
    • Negative: (?!...)
  • Lookbehind: Matches based on what precedes.
    • Positive: (?<=...)
    • Negative: (?<!...)

Example

  • \d(?=px) matches a digit only when it is immediately followed by “px”; the “px” itself is not part of the match.
  • (?<=\$)\d+ matches one or more digits immediately preceded by “$”.
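The sketch below tries out lookarounds, assuming JavaScript on an engine that supports lookbehind (current Node.js and browsers do). The sample strings are made up. Notice that the lookaround text never ends up in the match itself.

```javascript
// Sample strings are assumptions chosen for illustration.
const cssText = "width: 100px; height: 50em";
const receipt = "Total: $45 for 3 items";

// Positive lookahead: digits that are followed by "px".
const singleDigit = cssText.match(/\d(?=px)/)[0];
const pxNumbers = cssText.match(/\d+(?=px)/g);

// Positive lookbehind: digits that are preceded by "$".
const dollarAmounts = receipt.match(/(?<=\$)\d+/g);

// Negative lookbehind: whole numbers that are NOT preceded by "$".
const plainNumbers = receipt.match(/(?<!\$)\b\d+/g);

console.log(singleDigit);   // "0" (only the digit right before "px")
console.log(pxNumbers);     // ["100"]
console.log(dollarAmounts); // ["45"]
console.log(plainNumbers);  // ["3"]
```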

Escaping Special Characters

To match special characters literally, escape them with a backslash (\).

Example

  • \. matches a literal dot.
  • \$ matches a literal dollar sign.
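Forgetting to escape is an easy mistake to make, so here is a small sketch of what changes, assuming JavaScript. The price string is made up.

```javascript
// Sample string is an assumption chosen for illustration.
const priceText = "Total: $9.99 (was $12.50)";

// Escaped "$" and "." are treated as ordinary characters.
const prices = priceText.match(/\$\d+\.\d+/g);
console.log(prices); // ["$9.99", "$12.50"]

// Unescaped "$" is the end-of-string anchor, so this can never match.
console.log(/$9/.test(priceText));  // false

// Unescaped "." accepts any character, so "9x99" slips through.
console.log(/9.99/.test("9x99"));   // true
console.log(/9\.99/.test("9x99"));  // false
```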

Practical Example: Matching an IP Address

Regex

\d{1,3}(\.\d{1,3}){3}

Explanation

  • \d{1,3}: Matches 1-3 digits.
  • \.: Matches a literal dot.
  • {3}: Repeats the previous group 3 times.

Combining Concepts

To match a string that contains an IP address and nothing else:

  • Use ^ and $ to anchor the pattern.

  • Example:

    ^\d{1,3}(\.\d{1,3}){3}$
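A sketch comparing the unanchored and anchored patterns, assuming JavaScript. The sample inputs are made up. Note that both patterns only check the shape of the address (four groups of 1-3 digits), not whether each number is in the 0-255 range.

```javascript
const ipAnywhere = /\d{1,3}(\.\d{1,3}){3}/;   // finds an IP-shaped substring
const ipExact = /^\d{1,3}(\.\d{1,3}){3}$/;    // the whole string must be IP-shaped

console.log(ipAnywhere.test("Server at 192.168.1.1 is up")); // true
console.log(ipExact.test("Server at 192.168.1.1 is up"));    // false
console.log(ipExact.test("192.168.1.1"));                    // true
console.log(ipExact.test("192.168.1"));                      // false: only three groups

// Shape only: out-of-range numbers still pass.
console.log(ipExact.test("999.999.999.999"));                // true
```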

Conclusion

Regex is a powerful tool for pattern matching. By combining concepts like quantifiers, groups, and anchors, you can create complex patterns to solve real-world problems.

If you found this tutorial helpful, leave a like or comment with your favorite regex use case. Stay curious and keep learning!


Crispyfish Tech © 2025